B
binary search tree
binary tree
binary tree sort
C
child node
circular, doubly-linked list
circular, singly-linked list
D
deleting an item from a binary tree
dequeue
doubly-linked list
duplicate elimination
dynamic data structures
dynamic memory allocation
E
enqueue
F
first-in first-out (FIFO)
H
head of a queue
I
inorder traversal
insertion
L
last-in first-out (LIFO) data structure
leaf node
left child
left subtree
level-order binary tree traversal
linear data structure
linked list
N
node
nonlinear, two- dimensional data structure
null pointer (0)
P
parent node
pop
postorder traversal
predicate function
preorder traversal
push
Q
queue
R
right child
right subtree
root node
S
self-referential class
sibling
singly-linked list
sizeof
stack
T
tail of a queue
top of a stack
traversal of a binary tree
tree

A class object's size is not necessarily the sum of the sizes of its data members. This is because of various machine-dependent boundary alignment requirements (see Chapter 16) and other reasons. Use operator
sizeof to determine an object's size.

Much that I bound, I could not free;
Much that I freed returned to me.
Lee Wilson Dodd

'Will you walk a little faster?' said a whiting to a snail, 'There's a porpoise close behind us, and he's treading on my tail.'
Lewis Carroll

There is always room at the top.
Daniel Webster

Push on -- keep moving.
Thomas Morton

I think that I shall never see A poem lovely as a tree.
Joyce Kilmer

Fig. 15.1 Two self-referential class objects linked together.
Fig. 15.2 A graphical representation of a list.
Fig. 15.3 Manipulating a linked list.
Fig. 15.5 The insertAtFront operation.
Fig. 15.6 A graphical representation of the insertAtBack operation.
Fig. 15.7 A graphical representation of the removeFromFront operation.
Fig. 15.8 A graphical representation of the removeFromBack operation.
Fig. 15.9 A simple stack program.
Fig. 15.11 A simple stack program using composition.
Fig. 15.12 Processing a queue.
Fig. 15.14 A graphical representation of a binary tree.
Fig. 15.15 A binary search tree.
Fig. 15.16 Creating and traversing a binary tree.
Fig. 15.18 A binary search tree.
Fig. 15.19 A 15-node binary search tree.
Fig. 15.20 Simple commands.
Fig. 15.21 Simple program that determines the sum of two integers.
Fig. 15.22 Simple program that finds the larger of two integers.
Fig. 15.23 Calculate the squares of several integers.
Fig. 15.24 Writing, compiling, and executing a Simple language program.
Fig. 15.25 SML instructions produced after the compiler's first pass.
Fig. 15.26 Symbol table for program of Fig. 15.25.
Fig. 15.27 Unoptimized code from the program of Fig. 15.25.
Fig. 15.28 Optimized code for the program of Fig. 15.25.

Figure 15.1 - Two self-referential class objects linked together.
Figure 15.2 - A graphical representation of a list.
Figure 15.5 - The insertAtFront operation.
Figure 15.6 - A graphical representation of the insertAtBack operation.
Figure 15.7 - A graphical representation of the removeFromFront operation.
Figure 15.8 - A graphical representation of the removeFromBack operation.
Figure 15.15 - A binary search tree.
Figure 15.14 - A graphical representation of a binary tree.
Figure 15.18 - A binary search tree.
Figure 15.19 - A 15-node binary search tree.
Figure 15.20 - Simple commands.
Figure 15.24 - Writing, compiling, and executing a Simple language program.
Figure 15.26 - Symbol table for program of Fig. 15.25.
Figure 15.25 - SML instructions produced after the compiler's first pass. (Part 1 of 2)

Figure 15.25 - SML instructions produced after the compiler's first pass. (Part 2 of 2)

Figure 15.28 - Optimized code for the program of Fig. 15.25. (Part 1 of 2)
Figure 15.28 - Optimized code for the program of Fig. 15.25. (Part 2 of 2)
Figure 15.27 - Unoptimized code from the program of Fig. 15.25.
Answer 15.1
a) referential.
b) new.
c) stack.
d) predicate functions.
e) first-in-first-out (FIFO).
f) link.
g) delete.
h) queue.
i) tree.
j) last-in-first-out (LIFO).
k) binary.
l) root.
m) child or subtree.
n) leaf.
o) inorder, preorder, and postorder.


Answer 15.2
It is possible to insert a node anywhere in a linked list and remove a node from anywhere in a linked list. Nodes in a stack may only be inserted at the top of the stack and removed from the top of a stack.


Answer 15.3
A queue has pointers to both its head and its tail so that nodes may be inserted at the tail and deleted from the head. A stack has a single pointer to the top of the stack where both insertion and deletion of nodes are performed.

Answer 15.4
a) Classes allow us to instantiate as many data structure objects of a certain type (i.e., class) as we wish.
b) Class templates enable us to instantiate related classes--each based on different type parameters--we can then generate as many objects of each template class as we like.
c) Inheritance enables us to reuse code from a base class in a derived class so that the derived-class data structure is also a base-class data structure (with public inheritance, that is).

d) Private inheritance enables us to reuse portions of the code from a base class to form a derived-class data structure; because the inheritance is private, all public base-class member functions become private in the derived class. This enables us to prevent clients of the derived-class data structure from accessing base-class member functions that do not apply to the derived class.
e) Composition enables us to reuse code by making a class object data structure a member of a composed class; if we make the class object a private member of the composed class, then the class object's public member functions are not available through the composed object's interface.
Answer 15.5
The inorder traversal is:

11 18 19 28 32 40 44 49 69 71 72 83 92 97 99

The preorder traversal is:

49 28 18 11 19 40 32 44 83 71 69 72 97 92 99

The postorder traversal is:

11 19 18 32 44 40 28 69 72 71 92 99 97 83 49

Answer 15.6
The solution to this exercise can be found on your Cyber Classroom CD. Copy the file cpphtp2/answers/P15_06.zip to your hard drive and unzip the program code.

Answer 15.8
The solution to this exercise can be found on your Cyber Classroom CD. Copy the file cpphtp2/answers/P15_08.zip to your hard drive and unzip the program code.

Answer 15.10
The solution to this exercise can be found on your Cyber Classroom CD. Copy the file cpphtp2/answers/P15_10.zip to your hard drive and unzip the program code.

Answer 15.11
The solution to this exercise can be found on your Cyber Classroom CD. Copy the file cpphtp2/answers/P15_11.zip to your hard drive and unzip the program code.

Answer 15.17
The solution to this exercise can be found on your Cyber Classroom CD. Copy the file cpphtp2/answers/P15_17.zip to your hard drive and unzip the program code.

Answer 15.20
The solution to this exercise can be found on your Cyber Classroom CD. Copy the file cpphtp2/answers/P15_20.zip to your hard drive and unzip the program code.

Answer 15.24
The solution to this exercise can be found on your Cyber Classroom CD. Copy the file cpphtp2/answers/P15_24.zip to your hard drive and unzip the program code.

Exercise 15.1
Fill in the blanks in each of the following:
a) A self-________ class is used to form dynamic data structures. that can grow and shrink at execution time
b) Operator ________ is used to dynamically allocate memory; this operator returns a pointer to the allocated memory.
c) A ________ is a constrained version of a linked list in which nodes can be inserted and deleted only from the start of the list; this data structure returns node values in last-in-first-out order.
d) A function that does not alter a linked list, but simply looks at the list to determine if it is empty is referred to as a ________function.
e) A queue is referred to as a ________ data structure because the first nodes inserted are the first nodes removed.
f) The pointer to the next node in a linked list is referred to as a ________.
g) Operator ________ is used to reclaim dynamically allocated memory.
h) A ________ is a constrained version of a linked list in which nodes can be inserted only at the end of the list and deleted only from the start of the list.
i) A ________ is a nonlinear, two-dimensional data structure that contains nodes with two or more links.
j) A stack is referred to as a ________ data structure because the last node inserted is the first node removed.
k) The nodes of a ________ tree contain two link members.
l) The first node of a tree is the ________ node.
m) Each link in a tree node points to a ________ or ________ of that node.
n) A tree node that has no children is called a ________ node.
o) The four traversal algorithms we mentioned in the text for binary search trees are ______, ______, _______ and ______.


Exercise 15.2
What are the differences between a linked list and a stack?


Exercise 15.3
What are the differences between a stack and a queue?


Exercise 15.4
Perhaps a more appropriate title for this chapter would have been "Reusable Data Structures." Comment on how each of the following entities or concepts contributes to the reusability of data structures:
a) classes
b) template classes
c) inheritance
d) private inheritance
e) composition.
Exercise 15.5
Manually provide the inorder, preorder, and postorder traversals of the binary search tree of Fig. 15.19.


Exercise 15.6
Write a program that concatenates two linked list objects of characters. The program should include function concatenate that takes references to both list objects as arguments and concatenates the second list to the first list.


Exercise 15.7
Write a program that merges two ordered list objects of integers into a single ordered list object of integers. Function merge should receive references to each of the list objects to be merged, and should return a reference to the merged list object.


Exercise 15.8
Write a program that inserts 25 random integers from 0 to 100 in order in a linked list object. The program should calculate the sum of the elements, and the floating-point average of the elements.


Exercise 15.9
Write a program that creates a linked list object of 10 characters, then creates a second list object containing a copy of the first list, but in reverse order.


Exercise 15.10
Write a program that inputs a line of text and uses a stack object to print the line reversed.


Exercise 15.11
Write a program that uses a stack object to determine if a string is a palindrome (i.e., the string is spelled identically backwards and forwards). The program should ignore spaces and punctuation.


Exercise 15.12
Stacks are used by compilers to help in the process of evaluating expressions and generating machine language code. In this and the next exercise, we investigate how compilers evaluate arithmetic expressions consisting only of constants, operators, and parentheses.
Humans generally write expressions like 3 + 4 and 7 / 9 in which the operator (+ or / here) is written between its operands--this is called infix notation. Computers "prefer" postfix notation in which the operator is written to the right of its two operands. The preceding infix expressions would appear in postfix notation as 3 4 + and 7 9 /, respectively.
To evaluate a complex infix expression, a compiler would first convert the expression to postfix notation, and then evaluate the postfix version of the
expression. Each of these algorithms requires only a single left-to-right pass of the expression. Each algorithm uses a stack object in support of its operation, and in each algorithm the stack is used for a different purpose.
In this exercise, you will write a C++ version of the infix-to-postfix conversion algorithm. In the next exercise, you will write a C++ version of the postfix expression evaluation algorithm. Later in the chapter, you will discover that code you write in this exercise can help you implement a complete working compiler.
Write a program that converts an ordinary infix arithmetic expression (assume a valid expression is entered) with single digit integers such as

(6 + 2) * 5 - 8 / 4

to a postfix expression. The postfix version of the preceding infix expression is

6 2 + 5 * 8 4 / -

The program should read the expression into character array infix, and use modified versions of the stack functions implemented in this chapter to help create the postfix expression in character array postfix. The algorithm for creating a postfix expression is as follows:
1. Push a left parenthesis '(' onto the stack.
2. Append a right parenthesis ')' to the end of infix.
While the stack is not empty, read infix from left to right and do the following:

If the current character in infix is a digit, copy it to the next element of postfix.

If the current character in infix is a left parenthesis, push it onto the stack.

If the current character in infix is an operator,


3. Pop operators (if there are any) at the top of the stack while they have equal or

higher precedence than the current operator, and insert the popped

operators in postfix.

Push the current character in infix onto the stack.

If the current character in infix is a right parenthesis

Pop operators from the top of the stack and insert them in postfix until a left

parenthesis is at the top of the stack.

Pop (and discard) the left parenthesis from the stack.
The following arithmetic operations are allowed in an expression:
+addition
-subtraction
*multiplication
/division
^exponentiation
%modulus
The stack should be maintained with stack nodes that each contain a data member and a pointer to the next stack node.
Some of the functional capabilities you may want to provide are:
a) Function convertToPostfix that converts the infix expression to postfix notation.
b) Function isOperator that determines if c is an operator.
c) Function precedence that determines if the precedence of operator1 is less than, equal to, or greater than the precedence of operator2. The function returns -1, 0, and 1, respectively.
d) Function push that pushes a value onto the stack.
e) Function pop that pops a value off the stack.
f) Function stackTop that returns the top value of the stack without popping the stack.
g) Function isEmpty that determines if the stack is empty.
h) Function printStack that prints the stack.


Exercise 15.13
Write a program that evaluates a postfix expression (assume it is valid) such as

6 2 + 5 * 8 4 / -

The program should read a postfix expression consisting of digits and operators into a character array. Using modified versions of the stack functions implemented earlier in this chapter, the program should scan the expression and evaluate it. The algorithm is as follows:
1. Append the null character ('\\0') to the end of the postfix expression. When the null character is encountered, no further processing is necessary.
While '\\0' has not been encountered, read the expression from left to right.

If the current character is a digit,

Push its integer value onto the stack (the integer value of a digit character is
2. its

value in the computer's character set minus the value of ' 0' in the computer's

character set).

Otherwise, if the current character is an operator,

Pop the two top elements of the stack into variables x and y.

Calculate y operator x.

Push the result of the calculation onto the stack.
3. When the null character is encountered in the expression, pop the top value of the stack. This is the result of the postfix expression.
Note: In 2) above, if the operator is '/', the top of the stack is 2, and the next element in the stack is 8, then pop 2 into x, pop 8 into y, evaluate 8 / 2, and push
the result, 4, back onto the stack. This note also applies to operator '-'. The arithmetic operations allowed in an expression are:
+addition
-subtraction
*multiplication
/division
^exponentiation
%modulus
The stack should be maintained with stack nodes that contain an int data member and a pointer to the next stack node. You may want to provide the following functional capabilities:
a) Function evaluatePostfixExpression that evaluates the postfix expression.
b) Function calculate that evaluates the expression op1 operator op2.
c) Function push that pushes a value onto the stack.
d) Function pop that pops a value off the stack.
e) Function isEmpty that determines if the stack is empty.
f) Function printStack that prints the stack.


Exercise 15.14
Modify the postfix evaluator program of Exercise 15.13 so that it can process integer operands larger than 9.


Exercise 15.15
(Supermarket simulation) Write a program that simulates a check-out line at a supermarket. The line is a queue object. Customers (i.e., customer objects) arrive in random integer intervals of 1 to 4 minutes. Also, each customer is serviced in random integer intervals of 1 to 4 minutes. Obviously, the rates need to be balanced. If the average arrival rate is larger than the average service rate, the queue will grow infinitely. Even with "balanced" rates, randomness can still cause long lines. Run the supermarket simulation for a 12-hour day (720 minutes) using the following algorithm:
1. Choose a random integer between 1 and 4 to determine the minute at which the first customer arrives.
2. At the first customer's arrival time:

Determine customer's service time (random integer from 1 to 4);

Begin servicing the customer;

Schedule arrival time of next customer (random integer 1 to 4 added to the current time).
For each minute of the day:

If the next customer arrives,

Say so,

Enqueue the customer;

Schedule the arrival time of the next customer;


3. If service was completed for the last customer;

Say so

Dequeue next customer to be serviced

Determine customer's service completion time (random integer from 1 to 4

added to the current time).
Now run your simulation for 720 minutes and answer each of the following:
a) What is the maximum number of customers in the queue at any time?
b) What is the longest wait any one customer experiences?
c) What happens if the arrival interval is changed from 1-to-4 minutes to 1-to-3 minutes?


Exercise 15.16
Modify the program of Fig. 15.16 to allow the binary tree object to contain duplicates.


Exercise 15.17
Write a program based on the program of Fig. 15.16 that inputs a line of text, tokenizes the sentence into separate words (you may want to use the strtok library function), inserts the words in a binary search tree, and prints the inorder, preorder, and postorder traversals of the tree. Use an OOP approach.


Exercise 15.18
In this chapter, we saw that duplicate elimination is straightforward when creating a binary search tree. Describe how you would perform duplicate elimination using only a single-subscripted array. Compare the performance of array-based duplicate elimination with the performance of binary-search-tree- based duplicate elimination.

Exercise 15.19
Write a function depth that receives a binary tree and determines how many levels it has.


Exercise 15.20
(Recursively print a list backwards) Write a member function printListBackwards that recursively outputs the items in a linked list object in reverse order. Write a test program that creates a sorted list of integers and prints the list in reverse order.


Exercise 15.21
(Recursively search a list) Write a member function searchList that recursively searches a linked list object for a specified value. The function should return a pointer to the value if it is found; otherwise, null should be returned. Use your function in a test program that creates a list of integers. The program should prompt the user for a value to locate in the list.


Exercise 15.22
(Binary tree delete) In this exercise, we discuss deleting items from binary search trees. The deletion algorithm is not as straightforward as the insertion algorithm. There are three cases that are encountered when deleting an item-- the item is contained in a leaf node (i.e., it has no children), the item is contained in a node that has one child, or the item is contained in a node that has two children.
If the item to be deleted is contained in a leaf node, the node is deleted and the pointer in the parent node is set to null.
If the item to be deleted is contained in a node with one child, the pointer in the parent node is set to point to the child node and the node containing the data item
is deleted. This causes the child node to take the place of the deleted node in the tree.
The last case is the most difficult. When a node with two children is deleted, another node in the tree must take its place. However, the pointer in the parent node cannot simply be assigned to point to one of the children of the node to be deleted. In most cases, the resulting binary search tree would not adhere to the following characteristic of binary search trees (with no duplicate values): The values in any left subtree are less than the value in the parent node, and the values in any right subtree are greater than the value in the parent node.
Which node is used as a replacement node to maintain this characteristic? Either the node containing the largest value in the tree less than the value in the node being deleted, or the node containing the smallest value in the tree greater than
the value in the node being deleted. Let us consider the node with the smaller value. In a binary search tree, the largest value less than a parent's value is located in the left subtree of the parent node and is guaranteed to be contained in the rightmost node of the subtree. This node is located by walking down the left subtree to the right until the pointer to the right child of the current node is null. We are now pointing to the replacement node which is either a leaf node or a node with one child to its left. If the replacement node is a leaf node, the steps to perform the deletion are as follows:
1. Store the pointer to the node to be deleted in a temporary pointer variable (this pointer is used to delete the dynamically allocated memory)
2. Set the pointer in the parent of the node being deleted to point to the replacement node
3. Set the pointer in the parent of the replacement node to null
4. Set the pointer to the right subtree in the replacement node to point to the right subtree of the node to be deleted
5. Delete the node to which the temporary pointer variable points.
The deletion steps for a replacement node with a left child are similar to those for a replacement node with no children, but the algorithm also must move the child in to the replacement node's position in the tree. If the replacement node is a node with a left child, the steps to perform the deletion are as follows:
1. Store the pointer to the node to be deleted in a temporary pointer variable
2. Set the pointer in the parent of the node being deleted to point to the replacement node
3. Set the pointer in the parent of the replacement node to point to the left child of the replacement node
4. Set the pointer to the right subtree in the replacement node to point to the right subtree of the node to be deleted
5. Delete the node to which the temporary pointer variable points.
Write member function deleteNode which takes as its arguments a pointer to the root node of the tree object and the value to be deleted. The function should locate in the tree the node containing the value to be deleted and use the algorithms discussed here to delete the node. If the value is not found in the tree, the function should print a message that indicates whether or not the value is deleted. Modify the program of Fig. 15.16 to use this function. After deleting an
item, call the inOrder, preOrder, and postOrder traversal functions to confirm that the delete operation was performed correctly.


Exercise 15.23
(Binary tree search) Write member function binaryTreeSearch that attempts to locate a specified value in a binary search tree object. The function should take as arguments a pointer to the root node of the binary tree and a search key to be located. If the node containing the search key is found, the function should return a pointer to that node; otherwise, the function should return a null pointer.


Exercise 15.24
(Level-order binary tree traversal) The program of Fig. 15.16 illustrated three recursive methods of traversing a binary tree--inorder, preorder, and postorder traversals. This exercise presents the level-order traversal of a binary tree in which the node values are printed level-by-level starting at the root node level. The nodes on each level are printed from left to right. The level-order traversal is not a recursive algorithm. It uses a queue object to control the output of the nodes. The algorithm is as follows:
1. Insert the root node in the queue
2. While there are nodes left in the queue,

Get the next node in the queue

Print the node's value
If the pointer to the left child of the node is not null

Insert the left child node in the queue

If the pointer to the right child of the node is not null

Insert the right child node in the queue.
Write member function levelOrder to perform a level-order traversal of a binary tree object. Modify the program of Fig 15.16 to use this function. (Note: You will also need to modify and incorporate the queue processing functions of Fig. 15.12 in this program.)
Exercise 15.25
(Printing trees) Write a recursive member function outputTree to display a binary tree object on the screen. The function should output the tree row-by-row with the top of the tree at the left of the screen and the bottom of the tree toward the right of the screen. Each row is output vertically. For example, the binary tree illustrated in Fig. 15.19 is output as follows:

Note the rightmost leaf node appears at the top of the output in the rightmost column and the root node appears at the left of the output. Each column of output starts five spaces to the right of the previous column. Function outputTree should receive an argument totalSpaces representing the number of spaces preceding the value to be output (this variable should start at zero so the root node is output at the left of the screen). The function uses a modified inorder traversal to output the tree--it starts at the rightmost node in the tree and works back to the left. The algorithm is as follows:
While the pointer to the current node is not null
Recursively call outputTree with the right subtree of the current node and totalSpaces + 5
Use a for structure to count from 1 to totalSpaces and output spaces
Output the value in the current node
Set the pointer to the current node to point to the left subtree of the current node Increment totalSpaces by 5.


Special Section: Building Your Own Compiler
In Exercises 5.18 and 5.19, we introduced Simpletron Machine Language (SML) and you implemented a Simpletron computer simulator to execute programs written in SML. In this section, we build a compiler that converts programs written in a high-level programming language to SML. This section "ties" together the entire programming process. You will write programs in this new high-level language, compile these programs on the compiler you build, and run the programs on the simulator you built in Exercise 7.19. You should make every effort to implement your compiler in an object-oriented manner.
Exercise 15.26
(The Simple Language) Before we begin building the compiler, we discuss a simple, yet powerful, high-level language similar to early versions of the popular language BASIC. We call the language Simple. Every Simple statement consists of a line number and a Simple instruction. Line numbers must appear in ascending order. Each instruction begins with one of the following Simple commands: rem, input, let, print, goto, if/goto, or end (see Fig. 15.20). All commands except end can be used repeatedly. Simple evaluates only integer expressions using the +, -, *, and / operators. These operators have the same precedence as in C. Parentheses can be used to change the order of evaluation of an expression.
Our Simple compiler recognizes only lowercase letters. All characters in a Simple file should be lowercase (uppercase letters result in a syntax error unless they appear in a rem statement in which case they are ignored). A variable name is a single letter. Simple does not allow descriptive variable names, so variables should be explained in remarks to indicate their use in a program. Simple uses only integer variables. Simple does not have variable declarations--merely mentioning a variable name in a program causes the variable to be declared and initialized to zero automatically. The syntax of Simple does not allow string manipulation (reading a string, writing a string, comparing strings, etc.). If a string is encountered in a Simple program (after a command other than rem), the compiler generates a syntax error. The first version of our compiler will assume
that Simple programs are entered correctly. Exercise 15.29 asks the student to modify the compiler to perform syntax error checking.
Simple uses the conditional if/goto statement and the unconditional goto statement to alter the flow of control during program execution. If the condition in the if/goto statement is true, control is transferred to a specific line of the program. The following relational and equality operators are valid in an if/goto statement: <, >, <=, >=, ==, or !=. The precedence of these operators is the same as in C++.
Let us now consider several programs that demonstrate Simple's features. The first program (Fig. 15.21) reads two integers from the keyboard, stores the values in variables a and b, and computes and prints their sum (stored in variable c).
The program of Fig. 15.22 determines and prints the larger of two integers. The integers are input from the keyboard and stored in s and t. The if/goto statement tests the condition s >= t. If the condition is true, control is transferred to line 90 and s is output; otherwise, t is output and control is transferred to the end statement in line 99 where the program terminates.
Simple does not provide a repetition structure (such as C++'s for, while, or do/ while). However, Simple can simulate each of C++'s repetition structures using the if/goto and goto statements. Figure 15.23 uses a sentinel-controlled loop to calculate the squares of several integers. Each integer is input from the keyboard and stored in variable j. If the value entered is the sentinel -9999, control is transferred to line 99 where the program terminates. Otherwise, k is assigned the
square of j, k is output to the screen, and control is passed to line 20 where the next integer is input.
Using the sample programs of Fig. 15.21, Fig. 15.22, and Fig. 15.23 as your guide, write a Simple program to accomplish each of the following:
a) Input three integers, determine their average, and print the result.
b) Use a sentinel-controlled loop to input 10 integers and compute and print their sum.
c) Use a counter-controlled loop to input 7 integers, some positive and some negative, and compute and print their average.
d) Input a series of integers and determine and print the largest. The first integer input indicates how many numbers should be processed.
e) Input 10 integers and print the smallest.
f) Calculate and print the sum of the even integers from 2 to 30.
g) Calculate and print the product of the odd integers from 1 to 9.


Exercise 15.27
(Building A Compiler; Prerequisite: Complete Exercises 5.18, 5.19, 15.12, 15.13, and 15.26) Now that the Simple language has been presented (Exercise 15.26), we discuss how to build a Simple compiler. First, we consider the process by which a Simple program is converted to SML and executed by the Simpletron simulator (see Fig. 15.24). A file containing a Simple program is read by the compiler and converted to SML code. The SML code is output to a file on disk, in which SML instructions appear one per line. The SML file is then loaded into the Simpletron simulator, and the results are sent to a file on disk and to the screen. Note that the Simpletron program developed in Exercise 5.19 took its input from the keyboard. It must be modified to read from a file so it can run the programs produced by our compiler.
The Simple compiler performs two passes of the Simple program to convert it to SML. The first pass constructs a symbol table (object) in which every line number (object), variable name (object) and constant (object) of the Simple program is stored with its type and corresponding location in the final SML code (the symbol table is discussed in detail below). The first pass also produces the corresponding SML instruction object(s) for each of the Simple statements (object, etc.). As we will see, if the Simple program contains statements that transfer control to a line later in the program, the first pass results in an SML program containing some "unfinished" instructions. The second pass of the compiler locates and completes the unfinished instructions, and outputs the SML program to a file.
First Pass
The compiler begins by reading one statement of the Simple program into memory. The line must be separated into its individual tokens (i.e., "pieces" of a statement) for processing and compilation (standard library function strtok can be used to facilitate this task). Recall that every statement begins with a line number followed by a command. As the compiler breaks a statement into tokens, if the token is a line number, a variable, or a constant, it is placed in the symbol table. A line number is placed in the symbol table only if it is the first token in a statement. The symbolTable object is an array of tableEntry objects representing each symbol in the program. There is no restriction on the number of symbols that can appear in the program. Therefore, the symbolTable for a
particular program could be large. Make the symbolTable a 100-element array for now. You can increase or decrease its size once the program is working.
Each tableEntry object contains three members. Member symbol is an integer containing the ASCII representation of a variable (remember that variable names are single characters), a line number, or a constant. Member type is one of the following characters indicating the symbol's type: 'C' for constant, 'L' for line number, or 'V' for variable. Member location contains the Simpletron memory location (00 to 99) to which the symbol refers. Simpletron memory is an array of 100 integers in which SML instructions and data are stored. For a line number, the location is the element in the Simpletron memory array at which the SML instructions for the Simple statement begin. For a variable or constant, the location is the element in the Simpletron memory array in which
the variable or constant is stored. Variables and constants are allocated from the end of Simpletron's memory backwards. The first variable or constant is stored in location at 99, the next in location at 98, etc.
The symbol table plays an integral part in converting Simple programs to SML. We learned in Chapter 5 that an SML instruction is a four-digit integer comprised of two parts--the operation code and the operand. The operation code is determined by commands in Simple. For example, the simple command input corresponds to SML operation code 10 (read), and the Simple command print corresponds to SML operation code 11 (write). The operand is a memory location containing the data on which the operation code performs its task (e.g., operation code 10 reads a value from the keyboard and stores it in the memory location specified by the operand). The compiler searches symbolTable to
determine the Simpletron memory location for each symbol so the corresponding location can be used to complete the SML instructions.
The compilation of each Simple statement is based on its command. For example, after the line number in a rem statement is inserted in the symbol table, the remainder of the statement is ignored by the compiler because a remark is for documentation purposes only. The input, print, goto and end statements correspond to the SML read, write, branch (to a specific location) and halt instructions. Statements containing these Simple commands are converted directly to SML (note that a goto statement may contain an unresolved reference if the specified line number refers to a statement further into the Simple program file; this is sometimes called a forward reference).
When a goto statement is compiled with an unresolved reference, the SML instruction must be flagged to indicate that the second pass of the compiler must complete the instruction. The flags are stored in 100-element array flags of type int in which each element is initialized to -1. If the memory location to which a line number in the Simple program refers is not yet known (i.e., it is not in the symbol table), the line number is stored in array flags in the element with the same subscript as the incomplete instruction. The operand of the incomplete instruction is set to 00 temporarily. For example, an unconditional branch instruction (making a forward reference) is left as +4000 until the second pass of the compiler. The second pass of the compiler will be described shortly.
Compilation of if/goto and let statements is more complicated than other statements--they are the only statements that produce more than one SML
instruction. For an if/goto statement, the compiler produces code to test the condition and to branch to another line if necessary. The result of the branch could be an unresolved reference. Each of the relational and equality operators can be simulated using SML's branch zero and branch negative instructions (or possibly a combination of both).
For a let statement, the compiler produces code to evaluate an arbitrarily complex arithmetic expression consisting of integer variables and/or constants. Expressions should separate each operand and operator with spaces. Exercises 15.12 and 15.13 presented the infix-to-postfix conversion algorithm and the postfix evaluation algorithm used by compilers to evaluate expressions. Before proceeding with your compiler, you should complete each of these exercises.
When a compiler encounters an expression, it converts the expression from infix notation to postfix notation, then evaluates the postfix expression.
How is it that the compiler produces the machine language to evaluate an expression containing variables? The postfix evaluation algorithm contains a "hook" where the compiler can generate SML instructions rather than actually evaluating the expression. To enable this "hook" in the compiler, the postfix evaluation algorithm must be modified to search the symbol table for each symbol it encounters (and possibly insert it), determine the symbol's corresponding memory location, and push the memory location onto the stack (instead of the symbol). When an operator is encountered in the postfix expression, the two memory locations at the top of the stack are popped and machine language for effecting the operation is produced using the memory
locations as operands. The result of each subexpression is stored in a temporary location in memory and pushed back onto the stack so the evaluation of the postfix expression can continue. When postfix evaluation is complete, the memory location containing the result is the only location left on the stack. This is popped and SML instructions are generated to assign the result to the variable at the left of the let statement.
Second Pass
The second pass of the compiler performs two tasks: resolve any unresolved references and output the SML code to a file. Resolution of references occurs as follows:
a) Search the flags array for an unresolved reference (i.e., an element with a value other than -1).
b) Locate the object in array symbolTable containing the symbol stored in the flags array (be sure that the type of the symbol is 'L' for line number).
c) Insert the memory location from member location into the instruction with the unresolved reference (remember that an instruction containing an unresolved reference has operand 00).
d) Repeat steps 1, 2, and 3 until the end of the flags array is reached.
After the resolution process is complete, the entire array containing the SML code is output to a disk file with one SML instruction per line. This file can be read by the Simpletron for execution (after the simulator is modified to read its input from a file). Compiling your first Simple program into an SML file and then executing that file should give you a real sense of personal accomplishment.
A Complete Example
The following example illustrates a complete conversion of a Simple program to SML as it will be performed by the Simple compiler. Consider a Simple program that inputs an integer and sums the values from 1 to that integer. The program and the SML instructions produced by the first pass of the Simple compiler are illustrated in Fig. 15.25. The symbol table constructed by the first pass is shown in Fig. 15.26.
Most Simple statements convert directly to single SML instructions. The exceptions in this program are remarks, the if/goto statement in line 20, and the let statements. Remarks do not translate into machine language. However, the line number for a remark is placed in the symbol table in case the line number is referenced in a goto statement or an if/goto statement. Line 20 of the program
specifies that if the condition y == x is true, program control is transferred to line 60. Because line 60 appears later in the program, the first pass of the compiler has not as yet placed 60 in the symbol table (statement line numbers are placed in the symbol table only when they appear as the first token in a statement). Therefore, it is not possible at this time to determine the operand of the SML branch zero instruction at location 03 in the array of SML instructions. The compiler places 60 in location 03 of the flags array to indicate that the second pass completes this instruction.
We must keep track of the next instruction location in the SML array because there is not a one-to-one correspondence between Simple statements and SML instructions. For example, the if/goto statement of line 20 compiles into three SML instructions. Each time an instruction is produced, we must increment the
instruction counter to the next location in the SML array. Note that the size of Simpletron's memory could present a problem for Simple programs with many statements, variables and constants. It is conceivable that the compiler will run out of memory. To test for this case, your program should contain a data counter to keep track of the location at which the next variable or constant will be stored in the SML array. If the value of the instruction counter is larger than the value of the data counter, the SML array is full. In this case, the compilation process should terminate and the compiler should print an error message indicating that it ran out of memory during compilation. This serves to emphasize that although the programmer is freed from the burdens of managing memory by the compiler, the compiler itself must carefully determine the placement of instructions and
data in memory, and must check for such errors as memory being exhausted during the compilation process.
A Step-by-Step View of the Compilation Process
Let us now walk through the compilation process for the Simple program in Fig. 15.25. The compiler reads the first line of the program

5 rem sum 1 to x

into memory. The first token in the statement (the line number) is determined using strtok (see Chapters 5 and 16 for a discussion of C++'s string manipulation functions). The token returned by strtok is converted to an integer using atoi so the symbol 5 can be located in the symbol table. If the symbol is not found, it is inserted in the symbol table. Since we are at the beginning of the program and this is the first line, no symbols are in the table yet. So, 5 is inserted into the
symbol table as type L (line number) and assigned the first location in SML array (00). Although this line is a remark, a space in the symbol table is still allocated for the line number (in case it is referenced by a goto or an if/goto). No SML instruction is generated for a rem statement, so the instruction counter is not incremented.
The statement

10 input x

is tokenized next. The line number 10 is placed in the symbol table as type L and assigned the first location in the SML array (00 because a remark began the program so the instruction counter is currently 00). The command input indicates that the next token is a variable (only a variable can appear in an input statement). Because input corresponds directly to an SML operation code, the
compiler simply has to determine the location of x in the SML array. Symbol x is not found in the symbol table. So, it is inserted into the symbol table as the ASCII representation of x, given type V, and assigned location 99 in the SML array (data storage begins at 99 and is allocated backwards). SML code can now be generated for this statement. Operation code 10 (the SML read operation code) is multiplied by 100, and the location of x (as determined in the symbol table) is added to complete the instruction. The instruction is then stored in the SML array at location 00. The instruction counter is incremented by 1 because a single SML instruction was produced.
The statement

15 rem   check y == x

is tokenized next. The symbol table is searched for line number 15 (which is not found). The line number is inserted as type L and assigned the next location in the array, 01 (remember that rem statements do not produce code, so the instruction counter is not incremented).
The statement

20 if y == x goto 60

is tokenized next. Line number 20 is inserted in the symbol table and given type L with the next location in the SML array 01. The command if indicates that a condition is to be evaluated. The variable y is not found in the symbol table, so it is inserted and given the type V and the SML location 98. Next, SML instructions are generated to evaluate the condition. Since there is no direct equivalent in SML for the if/goto, it must be simulated by performing a
calculation using x and y and branching based on the result. If y is equal to x, the result of subtracting x from y is zero, so the branch zero instruction can be used with the result of the calculation to simulate the if/goto statement. The first step requires that y be loaded (from SML location 98) into the accumulator. This produces the instruction 01 +2098. Next, x is subtracted from the accumulator. This produces the instruction 02 +3199. The value in the accumulator may be zero, positive, or negative. Since the operator is ==, we want to branch zero. First, the symbol table is searched for the branch location (60 in this case), which is not found. So, 60 is placed in the flags array at location 03, and the instruction 03 +4200 is generated (we cannot add the branch location because we have not assigned a location to line 60 in the SML array yet). The instruction counter is incremented to 04.
The compiler proceeds to the statement

25 rem   increment y

The line number 25 is inserted in the symbol table as type L and assigned SML location 04. The instruction counter is not incremented.
When the statement

30 let y = y + 1 

is tokenized, the line number 30 is inserted in the symbol table as type L and assigned SML location 04. Command let indicates that the line is an assignment statement. First, all the symbols on the line are inserted in the symbol table (if they are not already there). The integer 1 is added to the symbol table as type C and assigned SML location 97. Next, the right side of the assignment is converted from infix to postfix notation. Then the postfix expression (y 1 +) is
evaluated. Symbol y is located in the symbol table and its corresponding memory location is pushed onto the stack. Symbol 1 is also located in the symbol table and its corresponding memory location is pushed onto the stack. When the operator + is encountered, the postfix evaluator pops the stack into the right operand of the operator and pops the stack again into the left operand of the operator, then produces the SML instructions

04 +2098   (load y)

05 +3097 (add 1)

The result of the expression is stored in a temporary location in memory (96) with instruction

06 +2196   (store temporary)

and the temporary location is pushed on the stack. Now that the expression has been evaluated, the result must be stored in y (i.e., the variable on the left side of =). So, the temporary location is loaded into the accumulator and the accumulator is stored in y with the instructions

07 +2096   (load temporary)

08 +2198 (store y)

The reader will immediately notice that SML instructions appear to be redundant. We will discuss this issue shortly.
When the statement

35 rem   add y to total

is tokenized, line number 35 is inserted in the symbol table as type L and assigned location 09.
The statement

40 let t = t + y

is similar to line 30. The variable t is inserted in the symbol table as type V and assigned SML location 95. The instructions follow the same logic and format as line 30, and the instructions 09 +2095, 10 +3098, 11 +2194, 12 +2094, and 13 +2195 are generated. Note that the result of t + y is assigned to temporary location 94 before being assigned to t (95). Once again, the reader will note that the instructions in memory locations 11 and 12 appear to be redundant. Again, we will discuss this shortly.
The statement

45 rem   loop y

is a remark, so line 45 is added to the symbol table as type L and assigned SML location 14.
The statement

50 goto 20

transfers control to line 20. Line number 50 is inserted in the symbol table as type L and assigned SML location 14. The equivalent of goto in SML is the unconditional branch (40) instruction that transfers control to a specific SML location. The compiler searches the symbol table for line 20 and finds that it corresponds to SML location 01. The operation code (40) is multiplied by 100 and location 01 is added to it to produce the instruction 14 +4001.
The statement

55 rem   output result

is a remark, so line 55 is inserted in the symbol table as type L and assigned SML location 15.
The statement

60 print t

is an output statement. Line number 60 is inserted in the symbol table as type L and assigned SML location 15. The equivalent of print in SML is operation code 11 (write). The location of t is determined from the symbol table and added to the result of the operation code multiplied by 100.
The statement

99 end

is the final line of the program. Line number 99 is stored in the symbol table as type L and assigned SML location 16. The end command produces the SML
instruction +4300 (43 is halt in SML) which is written as the final instruction in the SML memory array.
This completes the first pass of the compiler. We now consider the second pass. The flags array is searched for values other than -1. Location 03 contains 60, so the compiler knows that instruction 03 is incomplete. The compiler completes the instruction by searching the symbol table for 60, determining its location, and adding the location to the incomplete instruction. In this case, the search determines that line 60 corresponds to SML location 15, so the completed instruction 03 +4215 is produced replacing 03 +4200. The Simple program has now been compiled successfully.
To build the compiler, you will have to perform each of the following tasks:
a) Modify the Simpletron simulator program you wrote in Exercise 5.19 to take its input from a file specified by the user (see Chapter 14). The simulator should output its results to a disk file in the same format as the screen output. Convert the simulator to be an object-oriented program. In particular, make each part of the hardware an object. Arrange the instruction types into a class hierarchy using inheritance. Then execute the program polymorphically simply by telling each instruction to execute itself with an executeInstruction message.
b) Modify the infix-to-postfix conversion algorithm of Exercise 15.12 to process multi-digit integer operands and single-letter variable name operands. Hint: Standard library function strtok can be used to locate each constant and variable in an expression, and constants can be converted from strings to integers using standard library function atoi. (Note: The data representation of
the postfix expression must be altered to support variable names and integer constants.)
c) Modify the postfix evaluation algorithm to process multi-digit integer operands and variable name operands. Also, the algorithm should now implement the "hook" discussed above so that SML instructions are produced rather than directly evaluating the expression. Hint: Standard library function strtok can be used to locate each constant and variable in an expression, and constants can be converted from strings to integers using standard library function atoi. (Note: The data representation of the postfix expression must be altered to support variable names and integer constants.)
d) Build the compiler. Incorporate parts (b) and (c) for evaluating expressions in let statements. Your program should contain a function that performs the first
pass of the compiler and a function that performs the second pass of the compiler. Both functions can call other functions to accomplish their tasks. Make your compiler as object oriented as possible.
e) Build the compiler. Incorporate parts (b) and (c) for evaluating expressions in let statements. Your program should contain a function that performs the first pass of the compiler and a function that performs the second pass of the compiler. Both functions can call other functions to accomplish their tasks. Make your compiler as object oriented as possible.


Exercise 15.28
(Optimizing the Simple Compiler) When a program is compiled and converted into SML, a set of instructions is generated. Certain combinations of instructions often repeat themselves, usually in triplets called productions. A production normally consists of three instructions such as load, add, and store. For example, Fig. 15.27 illustrates five of the SML instructions that were produced in the compilation of the program in Fig. 15.25 The first three instructions are the production that adds 1 to y. Note that instructions 06 and 07 store the accumulator value in temporary location 96, then load the value back into the accumulator so instruction 08 can store the value in location 98. Often a production is followed by a load instruction for the same location that was just stored. This code can be optimized by eliminating the store instruction and the
subsequent load instruction that operate on the same memory location, thus enabling the Simpletron to execute the program faster. Figure 15.28 illustrates the optimized SML for the program of Fig. 15.25. Note that there are four fewer instructions in the optimized code--a memory-space savings of 25%.
Modify the compiler to provide an option for optimizing the Simpletron Machine Language code it produces. Manually compare the non-optimized code with the optimized code, and calculate the percentage reduction.


Exercise 15.29
(Modifications to the Simple compiler) Perform the following modifications to the Simple compiler. Some of these modifications may also require modifications to the Simpletron Simulator program written in Exercise 5.19.
a) Allow the modulus operator (%) to be used in let statements. Simpletron Machine Language must be modified to include a modulus instruction.
b) Allow exponentiation in a let statement using ^ as the exponentiation operator. Simpletron Machine Language must be modified to include an exponentiation instruction.
c) Allow the compiler to recognize uppercase and lowercase letters in Simple statements (e.g., 'A' is equivalent to 'a'). No modifications to the Simpletron Simulator are required.
d) Allow input statements to read values for multiple variables such as input x, y. No modifications to the Simpletron Simulator are required.
e) Allow the compiler to output multiple values in a single print statement such as print a, b, c. No modifications to the Simpletron Simulator are required.
f) Add syntax-checking capabilities to the compiler so error messages are output when syntax errors are encountered in a Simple program. No modifications to the Simpletron Simulator are required.
g) Allow arrays of integers. No modifications to the Simpletron Simulator are required.
h) Allow subroutines specified by the Simple commands gosub and return. Command gosub passes program control to a subroutine and command return passes control back to the statement after the gosub. This is similar to a function
call in C++. The same subroutine can be called from many gosub commands distributed throughout a program. No modifications to the Simpletron Simulator are required.
i) Allow repetition structures of the form

for x = 2 to 10 step 2

Simple statements

next

This for statement loops from 2 to 10 with an increment of 2. The next line marks the end of the body of the for line. No modifications to the Simpletron Simulator are required.
j) Allow repetition structures of the form

for x = 2 to 10


   Simple statements


next

This for statement loops from 2 to 10 with a default increment of 1. No modifications to the Simpletron Simulator are required.
k) Allow the compiler to process string input and output. This requires the Simpletron Simulator to be modified to process and store string values. Hint: Each Simpletron word can be divided into two groups, each holding a two-digit integer. Each two-digit integer represents the ASCII decimal equivalent of a character. Add a machine language instruction that will print a string beginning at a certain Simpletron memory location. The first half of the word at that location is a count of the number of characters in the string (i.e., the length of the string). Each succeeding half word contains one ASCII character expressed as
two decimal digits. The machine language instruction checks the length and prints the string by translating each two-digit number into its equivalent character.
l) Allow the compiler to process floating-point values in addition to integers. The Simpletron Simulator must also be modified to process floating-point values.


Exercise 15.30
(A Simple interpreter) An interpreter is a program that reads a high-level language program statement, determines the operation to be performed by the statement, and executes the operation immediately. The high-level language program is not converted into machine language first. Interpreters execute slowly because each statement encountered in the program must first be deciphered. If statements are contained in a loop, the statements are deciphered each time they are encountered in the loop. Early versions of the BASIC programming language were implemented as interpreters.
Write an interpreter for the Simple language discussed in Exercise 15.26. The program should use the infix-to-postfix converter developed in Exercise 15.12 and the postfix evaluator developed in Exercise 15.13 to evaluate expressions in
a let statement. The same restrictions placed on the Simple language in Exercise 15.26 should be adhered to in this program. Test the interpreter with the Simple programs written in Exercise 15.26. Compare the results of running these programs in the interpreter with the results of compiling the Simple programs and running them in the Simpletron Simulator built in Exercise 5.19.


Exercise 15.31
(Insert/Delete Anywhere in a Linked List) Our linked list class template allowed insertions and deletions at only the front and the back of the linked list. These capabilities were convenient for us when we used private inheritance and composition to produce a stack class template and a queue class template with a minimal amount of code simply by reusing the list class template. Actually linked lists are more general that those we provided. Modify the linked list class template we developed in this chapter to handle insertions and deletions anywhere in the list.


Exercise 15.32
(List and Queues without Tail Pointers) Our implementation of a linked list ( Fig. 15.3) used both a firstPtr and a lastPtr. The lastPtr was useful for the insertAtBack and removeFromBack member functions of the List class. The insertAtBack function corresponds to the enqueue member function of the Queue class. Rewrite the List class so that it does not use a lastPtr. Thus, any operations on the tail of a list must begin searching the list from the front. Does this affect our implementation of the Queue class (Fig. 15.12)?


Exercise 15.33
Use the composition version of the stack program (Fig. 15.11) to form a complete working stack program. Modify this program to inline the member functions. Compare the two approaches. Summarize the advantages and disadvantages of inlining member functions.


Exercise 15.34
(Performance of Binary Tree Sorting and Searching) One problem with the binary tree sort is that the order in which the data is inserted affects the shape of the tree--for the same collection of data, different orderings can yield binary trees of dramatically different shapes. The performance of the binary tree sorting and searching algorithms is sensitive to the shape of the binary tree. What shape would a binary tree have if its data were inserted in increasing order? in decreasing order? What shape should the tree have to achieve maximal searching performance?


Exercise 15.35
(Indexed Lists) As presented in the text, linked lists must be searched sequentially. For large lists, this can result in poor performance. A common technique for improving list searching performance is to create and maintain an index to the list. An index is a set of pointers to various key places in the list. For example, an application that searches a large list of names could improve performance by creating an index with 26 entries--one for each letter of the alphabet. A search operation for a last name beginning with 'Y' would then first search the index to determine where the 'Y' entries begin, and then "jump into" the list at that point and search linearly until the desired name is found. This would be much faster than searching the linked list from the beginning. Use the List class of Fig. 15.3 as the basis of an IndexedList class. Write a program that
demonstrates the operation of indexed lists. Be sure to include member functions insertInIndexedList, searchIndexedList, and deleteFromIndexedList.


Figure 15.3 Manipulating a linked list.



Figure 15.9 A simple stack program.



Figure 15.11 A simple stack program using composition.



Figure 15.12 Processing a queue.



Figure 15.16 Creating and traversing a binary tree.



Figure 15.23 Calculate the squares of several integers.


Figure 15.22 Simple program that finds the larger of two integers.


Figure 15.21 Simple program that determines the sum of two integers.


Using dynamic memory allocation (instead of arrays) for data structures that grow and shrink at execution time can save memory. Keep in mind, however, that pointers occupy space, and that dynamic memory
allocation incurs the overhead of function calls.

The elements of an array are stored contiguously in memory. This allows immediate access to any array element because the address of any element can be calculated directly based on its position
relative to the beginning of the array. Linked lists do not afford such immediate "direct access" to their elements.

Insertion and deletion in a sorted array can be time consuming--all the elements following the inserted or deleted element must be shifted appropriately.

An array can be declared to contain more elements than the number of items expected, but this can waste memory. Linked lists can provide better memory utilization in these situations. Linked
lists allow the program to adapt at run time.

When memory that was dynamically allocated with new is no longer needed, use delete to return the memory to the system immediately.

Assign null (zero) to the link member of a new node. Pointers should be initialized before they are used.

* To be able to form linked data structures using pointers, self-referential classes, and recursion. * To be able to create and manipulate dynamic data structures such as linked lists, queues, stacks, and binary trees. * To understand various important applications of linked data structures. * To understand how to create reusable data structures with class templates, inheritance, and composition.
Not setting the link in the last node of a list to null (0).

Assuming that the size of a class object is simply the sum of the sizes of its data members.
Deleting memory with delete that was not allocated dynamically with new.

Not returning dynamically allocated memory when it is no longer needed can cause the system to run out of memory prematurely. This is sometimes called a "memory leak."

Trying to delete memory that has already been deleted can lead to unpredictable results at execution time.

Referring to memory that has been deleted.

Not setting the link in the bottom node of a stack to null (zero).

Not setting the link in the last node of a queue to null (zero).

Not setting to null (zero) the links in leaf nodes of a tree.

15 Data Structures
15.1Introduction
15.2Self-Referential Classes
15.3Dynamic Memory Allocation
15.4Linked Lists
15.5Stacks
15.6Queues
15.7Trees
15.8Summary
Terminology
Figures
15.1 Introduction
We have studied fixed-size data structures such as single-subscripted arrays, double-subscripted arrays and structs. This chapter introduces dynamic data structures that grow and shrink during execution. Linked lists are collections of data items "lined up in a row"--insertions and removals are made anywhere in a linked list. Stacks are important in compilers and operating systems--insertions and removals are made only at one end of a stack--its top. Queues represent waiting lines; insertions are made at the back (also referred to as the tail) of a queue, and removals are
made from the front (also referred to as the head) of a queue. Binary trees facilitate high-speed searching and sorting of data, efficient elimination of duplicate data items, representing file system directories, and compiling expressions into machine language. These data structures have many other interesting applications.
We will discuss the major types of data structures and implement programs that create and manipulate these data structures. We use classes, class templates, inheritance and composition to create and package these data structures for reusability and maintainability.
Studying this chapter is solid preparation for Chapter 20, "Standard Template Library (STL)." The STL is a major portion of the C++ Standard Library. The STL provides containers, iterators for traversing those containers and algorithms for processing the elements of those containers. You will see that the STL has taken each of the data structures we discuss here in Chapter 15 and packaged them into templatized classes. The STL code is carefully written to be portable, efficient and extensible. Once you understand the principles and construction of data structures as presented in Chapter 15, you will be able to make the best use of the prepackaged data structures, iterators and algorithms in
the STL. STL is by far the single most important enhancement to C++ in the ANSI/ISO Draft Standard. It is a world-class set of components for helping realize the vision of reuse, reuse, reuse.
The chapter examples are practical programs that you will be able to use in more advanced courses and in industry applications. The programs are especially heavy on pointer manipulation. The exercises include a rich collection of useful applications.
We encourage you to attempt the major project described in the special section entitled "Building Your Own Compiler." You have been using a compiler to translate your C++ programs to machine language so
that you could execute these programs on your computer. In this project, you will actually build your own compiler. It will read a file of statements written in a simple, yet powerful, high-level language similar to early versions of the popular language BASIC. Your compiler will translate these statements into a file of Simpletron Machine Language (SML) instructions-- SML is the language you learned in the Chapter 5 special section, "Building Your Own Computer." Your Simpletron Simulator program will then execute the SML program produced by your compiler! Implementing this project using a heavily object- oriented approach will give you a wonderful
opportunity to exercise most of what you have learned in this course. The special section carefully walks you through the specifications of the high-level language, and describes the algorithms you will need to convert each type of high-level language statement into machine language instructions. If you enjoy being challenged, you might attempt the many enhancements to both the compiler and the Simpletron Simulator suggested in the Exercises.
15.2 Self-Referential Classes
A self-referential class contains a pointer member that points to a class object of the same class type. For example, the definition

class Node {  

public:

Node( int );

void setData( int );

int getData() const;

void setNextPtr( const Node * );

const Node *getNextPtr() const;

private:

int data;

Node *nextPtr;

};

defines a type, Node. Type Node has two private data members--integer member data and pointer member nextPtr. Member nextPtr points to an object of type Node--an object of the same type as the one being declared here, hence the term "self-referential class." Member nextPtr is referred to as a link--i.e., nextPtr can be used to "tie" an object of type Node to another object of the same type. Type Node also has five member functions: a constructor that receives an integer to initialize member data, a setData function to set the value of member data, a getData function to return the value of member data, a setNextPtr function to set the
value of member nextPtr, and a getNextPtr function to return the value of member nextPtr.
Self-referential class objects can be linked together to form useful data structures such as lists, queues, stacks, and trees. Figure 15.1 illustrates two self-referential class objects linked together to form a list. Note that a slash--representing a null (0) pointer--is placed in the link member of the second self-referential class object to indicate that the link does not point to another object. The slash is only for illustration purposes; it does not correspond to the backslash character in C++. A null pointer normally indicates the end of a data structure
just as the null character ('\\0') indicates the end of a string.
Select the true statement(s).
A self-referential class contains an object of the same class type.
Self-referential class objects can be linked together to form data structures such as lists, queues, stacks, and trees.
15.3 Dynamic Memory Allocation
Creating and maintaining dynamic data structures requires dynamic memory allocation--the ability for a program to obtain more memory space at execution time to hold new nodes and to release space no longer needed. The limit for dynamic memory allocation can be as large as the amount of available physical memory in the computer or the amount of available virtual memory in a virtual memory system. Often, the limits are much smaller because available memory must be shared among many users.
Operators new and delete are essential to dynamic memory allocation. Operator new takes as an argument the type of the object being dynamically allocated and returns a pointer to an object of that type. For example, the statement

Node *newPtr = new Node( 10 );

allocates sizeof( Node ) bytes and stores a pointer to this memory in newPtr. If no memory is available, new returns a zero pointer. The 10 is the node object's data.
The delete operator deallocates memory allocated with new--i.e., the memory is returned to the system so that the memory can be reallocated in the future. To free
memory dynamically allocated by the preceding new, use the statement

delete newPtr;

Note that newPtr itself is not deleted; rather the space newPtr points to is deleted. If newPtr has the value 0 (indicating a pointer to nothing), the preceding statement has no effect.
The following sections discuss lists, stacks, queues and trees. These data structures are created and maintained with dynamic memory allocation and self-referential classes.
15.4 Linked Lists
A linked list is a linear collection of self-referential class objects, called nodes, connected by pointer links--hence, the term "linked" list. A linked list is accessed via a pointer to the first node of the list. Subsequent nodes are accessed via the link-pointer member stored in each node. By convention, the link pointer in the last node of a list is set to null (zero) to mark the end of the list. Data are stored in a linked list dynamically--each node is created as necessary. A node can contain data of any type including objects of other classes. If nodes contain base-class pointers or
base-class references to base-class and derived-class objects related by inheritance, we can have a linked list of such nodes and use virtual function calls to process these objects polymorphically. Stacks and queues are also linear data structures, and, as we will see, are constrained versions of linked lists. Trees are nonlinear data structures.
Lists of data can be stored in arrays, but linked lists provide several advantages. A linked list is appropriate when the number of data elements to be represented in the data structure at once is unpredictable. Linked lists are dynamic, so the length of a list can increase or decrease as necessary. The size of a "conventional"
C++ array, however, cannot be altered, because the array size is fixed at compile time. "Conventional" arrays can become full. Linked lists become full only when the system has insufficient memory to satisfy dynamic storage allocation requests.
Linked lists can be maintained in sorted order simply by inserting each new element at the proper point in the list. Existing list elements do not need to be moved.
Linked list nodes are normally not stored contiguously in memory. Logically, however, the nodes of a linked list appear to be contiguous. Figure 15.2 illustrates a linked list with several nodes.
The program of Fig. 15.3 (whose output is shown in Fig. 15.4) uses a List class template (see Chapter 12, "Templates") to manipulate a list of integer values and a list of floating-point values. The driver program (fig15_03.cpp) provides 5 options: 1) insert a value at the beginning of the list (function insertAtFront), 2) insert a value at the end of the list (function insertAtBack), 3) delete a value from the front of the list (function removeFromFront), 4) delete a value from the end of the list (function removeFromBack), and 5) terminate the list processing. A detailed discussion of the program follows. Exercise 15.20 asks you to implement a recursive function that prints a
linked list backwards, and Exercise 15.21 asks you to implement a recursive function that searches a linked list for a particular data item.
Figure 15.3 consists of two class templates--ListNode and List. Encapsulated in each List object is a linked- list of ListNode objects. The ListNode class template consists of private members data and nextPtr. ListNode member data stores a value of type NODETYPE, the type parameter passed to the class template. ListNode member nextPtr stores a pointer to the next ListNode object in the linked list.
The List class template consists of private members firstPtr (a pointer to the first ListNode in a List object)
and lastPtr (a pointer to the last ListNode in a List object). The default constructor initializes both pointers to 0 (null). The destructor ensures that all ListNode objects in a List object are destroyed when that List object is destroyed. The primary functions of the List class template are insertAtFront, insertAtBack, removeFromFront, and removeFromBack.
Function isEmpty is called a predicate function--it does not alter the List; rather, it determines if the List is empty (i.e., the pointer to the first node of the List is null). If the List is empty, true is returned; otherwise, false is returned. Function print displays the List's contents.
Over the next several pages, we will discuss each of the member functions of the List class in detail. Function insertAtFront (Fig. 15.5) places a new node at the front of the list. The function consists of several steps:
1. Call function getNewNode passing it value, which is a constant reference to the node value to be inserted.
2. Function getNewNode uses operator new to create a new list node and return a pointer to this list node. If this pointer is nonzero, getNewNode returns a pointer to this newly allocated node to newPtr in insertAtFront.
3. If the list is empty, then both firstPtr and lastPtr are set to newPtr.
4. If the list is not empty, then the node pointed to by newPtr is threaded into the list by copying firstPtr to newPtr->nextPtr so that the new node points to what used to be the first node of the list, and copying newPtr to firstPtr so that firstPtr now points to the new first node of the list.
Figure 15.5 illustrates function insertAtFront. Part a) of the figure shows the list and the new node before the insertAtFront operation. The dotted arrows in part b) illustrate the steps 2 and 3 of the insertAtFront operation that enable the node containing 12 to become the new list front.
Function insertAtBack (Fig. 15.6) places a new node at the back of the list. The function consists of several steps:
1. Call function getNewNode passing it value which is a constant reference to the node value to be inserted.
2. Function getNewNode uses operator new to create a new list node and return a pointer to this list node. If this pointer is nonzero, getNewNode returns a pointer to this newly allocated node to newPtr in insertAtBack.
3. If the list is empty, then both firstPtr and lastPtr are set to newPtr.
4. If the list is not empty, then the node pointed to by newPtr is threaded into the list by copying newPtr into lastPtr->nextPtr so that the new node is pointed to by what used to be the last node of the list, and copying newPtr to lastPtr so that lastPtr now points to the new last node of the list.
Figure 15.6 illustrates an insertAtBack operation. Part a) of the figure shows the list and the new node before the operation. The dotted arrows in part b) illustrate the steps of function insertAtBack that enable a new node to be added to the end of a list that is not empty.
Function removeFromFront (Fig. 15.7) removes the front node of the list and copies the node value to the
reference parameter. The function returns false if an attempt is made to remove a node from an empty list, and returns true if the removal is successful. The function consists of several steps:
1. Instantiate tempPtr as a copy of firstPtr. Eventually, tempPtr will be used to delete the memory space for the node being removed.
2. If firstPtr is equal to lastPtr, i.e., if the list has only one element prior to the removal attempt, then set firstPtr and lastPtr to zero to dethread that node from the list (leaving the list empty).
3. If the list has more than one node prior to removal, then leave lastPtr as is and simply set firstPtr to firstPtr->nextPtr, i.e., modify firstPtr to point to what was the second node prior to removal (and now, the new first node).
4. After all these pointer manipulations are complete, copy to reference parameter value the data member of the node being removed.
5. Now delete the memory space for the node pointed to by tempPtr.
6. Return true indicating successful removal.
Figure 15.7 illustrates function removeFromFront. Part a) illustrates the list before the removal operation. Part b) shows actual pointer manipulations.
Function removeFromBack (Fig. 15.8) removes the back node of the list and copies the node value to the reference parameter. The function returns false if an attempt is made to remove a node from an empty list, and returns true if the removal is successful. The function consists of several steps:
1. Instantiate tempPtr as a copy of lastPtr. Eventually, tempPtr will be used to delete the memory space for the node being remove.
2. If firstPtr is equal to lastPtr, i.e., if the list has only one element prior to the removal attempt, then set firstPtr and lastPtr to zero to dethread that node from the list (leaving the list empty).
3. If the list has more than one node prior to removal, then instantiate currentPtr as a copy of firstPtr.
4. Now "walk the list" with currentPtr until it points to the node before the last node. This is done with a while loop that keeps replacing currentPtr by currentPtr- >nextPtr while currentPtr->nextPtr is not lastPtr.
5. Copy currentPtr to lastPtr to dethread the back node from the list.
6. Set the currentPtr->nextPtr to zero in the new last node of the list.
7. After all the pointer manipulations are complete, copy to reference parameter value the data member of the node being removed.
8. Now delete the memory space for the node pointed to by tempPtr.
9. Return true indicating successful removal.
Figure 15.8 illustrates function removeFromFront. Part a) of the figure illustrates the list before the
removal operation. Part b) of the figure shows the actual pointer manipulations.
Function print first determines if the list is empty. If so, print prints "The list is empty" and terminates. Otherwise, it prints the data in the list. The function initializes currentPtr as a copy of firstPtr and then prints the string "The list is: ". While currentPtr is not null, currentPtr->data is printed and the value of currentPtr->nextPtr is assigned to currentPtr. Note that if the link in the last node of the list is not null, the printing algorithm will erroneously print past the end of the list. The printing algorithm is identical for linked lists, stacks, and queues.
The kind of linked list we have been discussing is a singly-linked list--the list begins with a pointer to the first node and each node contains a pointer to the next node "in sequence." This list terminates with a node whose pointer member is 0. A singly-linked list may be traversed in only one direction.
A circular, singly-linked list begins with a pointer to the first node and each node contains a pointer to the next node. The "last node" does not contain a 0 pointer; rather, the pointer in the last node points back to the first node, thus closing the "circle."
A doubly-linked list allows traversals both forwards and backwards. Such a list is often implemented with two
"start pointers"--one that points to the first element of the list to allow front-to-back traversal of the list, and one that points to the last element of the list to allow back-to-front traversal of the list. Each node has both a forward pointer to the next node in the list in the forward direction and a backward pointer to the next node in the list in the backward direction. If your list contains an alphabetized telephone directory, for example, searching for someone whose name begins with a letter near the front of the alphabet might begin from the front of the list. Searching for someone whose name begins with a letter near the end of the alphabet might begin from the back of the list.
In a circular, doubly-linked list, the forward pointer of the last node points to the first node and the backward pointer of the first node points to the last node, thus closing the "circle."
Select the true statement(s).
A circular, singly-linked list begins with a pointer to the first node and each node contains a pointer to the next node. The last node points to the first node.
A linked list is a non-linear collection of self-referential class objects connected by pointer links.
A node in a list can contain data of any type.
15.5 Stacks
In Chapter 12, "Templates," we explained the notion of a stack class template with an underlying array implementation. In this section, we use an underlying pointer-based linked-list implementation. We also discuss stacks in Chapter 20, "Standard Template Library (STL)."
A stack is a constrained version of a linked list--new nodes can be added to a stack and removed from a stack only at the top. For this reason, a stack is referred to as a last-in, first-out (LIFO) data structure. The link member
in the last node of the stack is set to null (zero) to indicate the bottom of the stack.
The primary member functions used to manipulate a stack are push and pop. Function push adds a new node to the top of the stack. Function pop removes a node from the top of the stack, stores the popped value in a reference variable that is passed to the calling function, and returns true if the pop operation was successful (false otherwise).
Stacks have many interesting applications. For example, when a function call is made, the called function must know how to return to its caller, so the return address is pushed onto a stack. If a series of
function calls occurs, the successive return values are pushed onto the stack in last-in, first-out order so that each function can return to its caller. Stacks support recursive function calls in the same manner as conventional nonrecursive calls.
Stacks contain the space created for automatic variables on each invocation of a function. When the function returns to its caller, the space for that function's automatic variables is popped off the stack, and those variables are no longer known to the program.
Stacks are used by compilers in the process of evaluating expressions and generating machine language code. The Exercises explore several
applications of stacks, including using them to develop a complete working compiler.
We will take advantage of the close relationship between lists and stacks to implement a stack class primarily by reusing a list class. We use two different forms of reusability. First, we implement the stack class through private inheritance of the list class. Then we implement an identically performing stack class through composition by including a list class as a private member of a stack class. Of course, all of the data structures in this chapter, including these two stack classes, are implemented as templates (see Chapter 12, "Templates") to encourage further reusability.
The program of Fig. 15.9 (whose output is shown in Fig. 15.10) creates a Stack class template primarily through private inheritance of the List class template Fig. 15.3. We want the Stack to have member functions push, pop, isStackEmpty and printStack. Note that these are essentially the insertAtFront, removeFromFront, isEmpty and print functions of the List class template. Of course, the List class template contains other member functions (i.e., insertAtBack and removeFromBack) that we would not want to make accessible through the public interface to the Stack class. So when we indicate that the Stack class template is to inherit the List class
template, we specify private inheritance. This makes all the List class template's member functions private in the Stack class template. When we implement the Stack's member functions, we then simply have each of these call the appropriate member function of the List class--push calls insertAtFront, pop calls removeFromFront, isStackEmpty calls isEmpty and printStack calls print.
The stack class template is used in main to instantiate integer stack intStack of type Stack< int >. Integers 0 through 3 are pushed onto intStack and then popped off intStack. The stack class template is then used to instantiate float stack doubleStack of type Stack<
double >. Values 1.1, 2.2, 3.3 and 4.4 are pushed onto doubleStack and then popped off doubleStack.
Another way to implement a Stack class template is by reusing a List class template through composition. The program of Fig. 15.11 uses the files list.h and listnd.h from the List program. It also uses the same driver program as the previous Stack program, except the new header file--stack_c.h--is included and replaces stack.h. The output is also the same. The Stack class template definition now includes member object s of type List< STACKTYPE >.
Select the true statement(s).
A stack is a first-in, first-out (FIFO) data structure.
The primary operations used to manipulate a stack are push and pop.
15.6 Queues
A queue is similar to a supermarket checkout line--the first person in line is serviced first and other customers enter the line at the end and wait to be serviced. Queue nodes are removed only from the head of the queue and are inserted only at the tail of the queue. For this reason, a queue is referred to as a first-in, first-out (FIFO) data structure. The insert and remove operations are known as enqueue and dequeue.
Queues have many applications in computer systems. Most computers have only a single processor, so only one user at a time can be serviced. Entries for the other
users are placed in a queue. Each entry gradually advances to the front of the queue as users receive service. The entry at the front of the queue is the next to receive service.
Queues are also used to support print spooling. A multiuser environment may have only a single printer. Many users may be generating outputs to be printed. If the printer is busy, other outputs may still be generated. These are "spooled" to disk (much as thread is wound onto a spool) where they wait in a queue until the printer becomes available.
Information packets also wait in queues in computer networks. Each time a packet arrives at a network node,
it must be routed to the next node on the network along the path to the packet's final destination. The routing node routes one packet at a time, so additional packets are enqueued until the router can route them.
A file server in a computer network handles file access requests from many clients throughout the network. Servers have a limited capacity to service requests from clients. When that capacity is exceeded, client requests wait in queues.
The program of Fig. 15.12 (whose output is shown in Fig. 15.13) creates a Queue class template primarily through private inheritance of the List class template of Fig 15.3. We want the Queue to have member
functions enqueue, dequeue, isQueueEmpty and printQueue. We note that these are essentially the insertAtBack, removeFromFront, isEmpty and print functions of the List class template. Of course, the List class template contains other member functions (i.e., insertAtFront and removeFromBack) that we would not want to make accessible through the public interface to the Queue class. So when we indicate that the Queue class template is to inherit the List class template, we specify private inheritance. This makes all the List class template's member functions private in the Queue class template. When we implement the Queue's member functions, we simply have each of
these call the appropriate member function of the list class--enqueue calls insertAtBack, dequeue calls removeFromFront, isQueueEmpty calls isEmpty and printQueue calls print.
The queue class template is used in main to instantiate integer queue intQueue of type Queue< int >. Integers 0 through 3 are enqueued to intQueue and then dequeued from intQueue in first-in-first-out order. The queue class template is then used to instantiate float queue doubleQueue of type Queue< double >. Values 1.1, 2.2, 3.3 and 4.4 are enqueued to doubleQueue and then dequeued from doubleQueue in first-in-first-out order.
Select the true statement(s).
A queue is a first-in, first-out (FIFO) data structure.
The primary operations used to manipulate a queue are enqueue and dequeue.
15.7 Trees
Linked lists, stacks, and queues are linear data structures. A tree is a nonlinear, two-dimensional data structure with special properties. Tree nodes contain two or more links. This section discusses binary trees (Fig. 15.14)--trees whose nodes all contain two links (none, one, or both of which may be null). The root node is the first node in a tree. Each link in the root node refers to a child. The left child is the root node of the left subtree, and the right child is the root node of the right subtree. The children of a node are called siblings. A node with no children is called a leaf node.
Computer scientists normally draw trees from the root node down--exactly the opposite of trees in nature.
In this section, a special binary tree called a binary search tree is created. A binary search tree (with no duplicate node values) has the characteristic that the values in any left subtree are less than the value in its parent node, and the values in any right subtree are greater than the value in its parent node. Figure 15.15 illustrates a binary search tree with 12 values. Note that the shape of the binary search tree that corresponds to a set of data can vary, depending on the order in which the values are inserted into the tree.
The program of Fig. 15.16 (whose output is shown in Fig. 15.17) creates a binary search tree and traverses it (i.e., walks through all its nodes) three ways--using recursive inorder, preorder, and postorder traversals.
Function main begins by instantiating integer tree intTree of type Tree<int>. The program prompts for 10 integers, each of which is inserted in the binary tree through a call to insertNode. The program then performs preorder, inorder and postorder traversals (these are explained shortly) of intTree. The program then instantiates floating-point tree doubleTree of type Tree<double>. The program prompts for 10 double values, each of which is inserted in the binary tree
through a call to insertNode. The program then performs preorder, inorder, and postorder traversals of doubleTree.
Now, we discuss the class template definitions. We begin with the TreeNode class template that declares as its friend the Tree class template. Class TreeNode has as private data the node's data value, and pointers leftPtr (to the node's left subtree) and rightPtr (to the node's right subtree). The constructor sets data to the value supplied as a constructor argument, and sets pointers leftPtr and rightPtr to zero (thus initializing this node to be a leaf node). Member function getData returns the data value.
The Tree class has as private data rootPtr, a pointer to the root node of the tree. The class has public member functions insertNode (that inserts a new node in the tree,) and preorderTraversal, inorderTraversal and postorderTraversal, each of which walks the tree in the designated manner. Each of these member functions calls its own separate recursive utility function to perform the appropriate operations on the internal representation of the tree. The Tree constructor initializes rootPtr to zero to indicate that the tree is initially empty.
The Tree class's utility function insertNodeHelper recursively inserts a node into the tree. A node can only
be inserted as a leaf node in a binary search tree. If the tree is empty, a new TreeNode is created, initialized, and inserted in the tree.
If the tree is not empty, the program compares the value to be inserted with the data value in the root node. If the insert value is smaller, the program recursively calls insertNodeHelper to insert the value in the left subtree. If the insert value is larger, the program recursively calls insertNodeHelper to insert the value in the right subtree. If the value to be inserted is identical to the data value in the root node, the program prints the message " dup" and returns without inserting the duplicate value into the tree.
Each of the member functions inOrderTraversal, preOrderTraversal, and postOrderTraversal traverse the tree (Fig. 15.18) and print the node values.
The steps for an inOrderTraversal are:
1. Traverse the left subtree with an inOrderTraversal.
2. Process the value in the node (i.e., print the node value).
3. Traverse the right subtree with an inOrderTraversal.
The value in a node is not processed until the values in its left subtree are processed. The inOrderTraversal of the tree in Fig. 15.18 is:

6 13 17 27 33 42 48

Note that the inOrderTraversal of a binary search tree prints the node values in ascending order. The process of creating a binary search tree actually sorts the data-- and thus this process is called the binary tree sort.
The steps for a preOrderTraversal are:
1. Process the value in the node.
2. Traverse the left subtree with a preOrderTraversal.
3. Traverse the right subtree with a preOrderTraversal.
The value in each node is processed as the node is visited. After the value in a given node is processed, the values in the left subtree are processed, then the values
in the right subtree are processed. The preOrderTraversal of the tree in Fig. 15.18 is:

27 13 6 17 42 33 48

The steps for a postOrderTraversal are:
1. Traverse the left subtree with a postOrderTraversal.
2. Traverse the right subtree with a postOrderTraversal.
3. Process the value in the node.
The value in each node is not printed until the values of its children are printed. The postOrderTraversal of the tree in Fig. 15.18 is:

6 17 13 33 48 42 27

The binary search tree facilitates duplicate elimination. As the tree is being created, an attempt to insert a duplicate value will be recognized because a duplicate will follow the same "go left" or "go right" decisions on each comparison as the original value did. Thus, the duplicate will eventually be compared with a node containing the same value. The duplicate value may simply be discarded at this point.
Searching a binary tree for a value that matches a key value is also fast. If the tree is tightly packed, then each level contains about twice as many elements as the previous level. So a binary search tree with n elements would have a maximum of log2n levels, and thus a
maximum of log2n comparisons would have to be made either to find a match or to determine that no match exists. This means, for example, that when searching a (tightly packed) 1000-element binary search tree, no more than 10 comparisons need to be made because 210 > 1000. When searching a (tightly packed) 1,000,000- element binary search tree, no more than 20 comparisons need to be made because 220 > 1,000,000.
In the Exercises, algorithms are presented for several other binary tree operations such as deleting an item from a binary tree, printing a binary tree in a two- dimensional tree format, and performing a level-order traversal of a binary tree. The level-order traversal of a
binary tree visits the nodes of the tree row-by-row starting at the root node level. On each level of the tree, the nodes are visited from left to right. Other binary tree exercises include allowing a binary search tree to contain duplicate values, inserting string values in a binary tree, and determining how many levels are contained in a binary tree.
Select the true statement(s).
A tree is a non-linear data structure.
Tree nodes can contain two or more links.
The root node is the first node in the tree.
A leaf node is a node containing children.
The preorder traversal of a binary search tree traverses the elements of the tree in sorted order.inorder
15.8 Summary
* Self-referential classes contain members called links that point to objects of the same class type. * Self-referential classes enable many class objects to be linked together in stacks, queues, lists, and trees. * Dynamic memory allocation reserves a block of bytes in memory to store an object during program execution. * A linked list is a linear collection of self-referential class objects. * A linked list is a dynamic data structure--the length of the list can increase or decrease as necessary.
* Linked lists can continue to grow until memory is exhausted. * Linked lists provide a mechanism for simple insertion and deletion of data by pointer manipulation. * A singly-linked list begins with a pointer to the first node and each node contains a pointer to the next node "in sequence." This list terminates with a node whose pointer member is 0. A singly-linked list may be traversed in only one direction. * A circular, singly-linked list begins with a pointer to the first node and each node contains a pointer to the next node. The pointer in the last node points back to the first node, thus closing the "circle." * A doubly-linked list allows traversals both forwards and backwards. Each node has both a forward pointer to the next node in the list in the forward direction, and a backward pointer to the next node in the list in the backward direction. * In a circular, doubly-linked list, the forward pointer of the last node points to the first node and the backward pointer of the first node points to the last node, thus closing the "circle." * Stacks and queues are constrained versions of linked lists. * New stack nodes are added to a stack and are removed from a stack only at the top of the stack. For * this reason, a stack is referred to as a last-in, first-out (LIFO) data structure. * The link member in the last node of the stack is set to null (zero) to indicate the bottom of the stack. * The two primary operations used to manipulate a stack are push and pop. The push operation creates a new node and places it on the top of the stack. The pop operation removes a node from the top of the stack, deletes the memory that was allocated to the popped node, and returns the popped value. * In a queue data structure, nodes are removed from the head and added to the tail. For this reason, a queue is referred to as a first-in, first-out (FIFO) data struc * ture. The add and remove operations are known as enqueue and dequeue. * Trees are two-dimensional data structures requiring two or more links per node. * Binary trees contain two links per node. * The root node is the first node in the tree. * Each of the pointers in the root node refers to a child. The left child is the first node in the left subtree, and the right child is the first node in the right subtree. The children of a node are called siblings. Any tree node that does not have any children is called a leaf node. * A binary search tree has the characteristic that the value in the left child of a node is less than the value in * its parent node, and the value in the right child of a node is greater than or equal to the value in its parent node. If there are no duplicate data values, the value in the right child is simply greater than the value in its parent node. * An inorder traversal of a binary tree traverses the left subtree inorder, processes the value in the root node, then traverses the right subtree inorder. The value in a node is not processed until the values in its left subtree are processed. * A preorder traversal processes the value in the root node, traverses the left subtree preorder, then traverses the right subtree preorder. The value in each node is processed as the node is encountered. * A postorder traversal traverses the left subtree postorder, traverses the right subtree postorder, then processes the value in the root node. The value in each node is not processed until the values in both its subtrees are processed.
This chapter does not contain any Testing and Debugging tips.
This chapter does not contain any Applet Examples.
This chapter does not contain any Software Engineering Observations.